Revisiting Batch Norm Initialization
Authors
Abstract
Batch normalization (BN) is comprised of a normalization component followed by an affine transformation and has become essential for training deep neural networks. The standard initialization of each BN in a network sets the affine transformation scale and shift to 1 and 0, respectively. However, after training we have observed that these parameters do not alter much from their initialization. Furthermore, we have noticed that the normalization process can still yield overly large values, which is undesirable for training. We revisit the BN formulation and present a new initialization method and update approach to address the aforementioned issues. Experiments are designed to emphasize and demonstrate the positive influence of proper BN scale initialization on performance, and use rigorous statistical significance tests for evaluation. The approach can be used with existing implementations at no additional computational cost. Source code is available at https://github.com/osu-cvl/revisiting-bn-init.
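The default affine initialization described in the abstract is easy to see in a framework such as PyTorch. The sketch below shows that default (scale 1, shift 0) and, purely as an illustrative assumption, a smaller starting scale; the value 0.1 is not the initialization prescribed by the paper.

```python
import torch
import torch.nn as nn

# Standard initialization: PyTorch's BatchNorm sets the affine scale
# (gamma, stored as .weight) to 1 and the shift (beta, .bias) to 0.
bn = nn.BatchNorm2d(num_features=64)
assert torch.all(bn.weight == 1) and torch.all(bn.bias == 0)

# Hypothetical alternative, for illustration only: start the scale below 1
# so post-BN activations begin with a smaller magnitude. The value 0.1 is
# an assumption, not the initialization proposed in the paper.
nn.init.constant_(bn.weight, 0.1)
nn.init.zeros_(bn.bias)
```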
Similar resources
Revisiting Norm Estimation in Data Streams
We revisit the problem of (1±ε)-approximating the Lp norm, 0 ≤ p ≤ 2, of an n-dimensional vector updated in a stream of length m with positive and negative updates to its coordinates. We give several new upper and lower bounds, some of which are optimal. LOWER BOUNDS. We show that for the interesting range of parameters, Ω(ε^{-2} log(nm)) bits of space are necessary for estimating Lp in one pass for ...
Revisiting Batch Normalization For Practical Domain Adaptation
Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection. However, it is still a common annoyance during the training phase, that one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain. Recent study (Tommasi et al., 2015) shows that a DNN has strong dependenc...
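The snippet above is truncated, but this line of work is associated with adapting BN statistics to a new domain without target labels. The sketch below illustrates that general idea under the assumption of a PyTorch model and an unlabeled target-domain data loader; it is not claimed to match the cited paper's exact procedure.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_bn_statistics(model: nn.Module, target_loader) -> None:
    """Re-estimate BN running statistics on unlabeled target-domain batches."""
    bn_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    for m in model.modules():
        if isinstance(m, bn_types):
            m.reset_running_stats()   # start the estimates from scratch
            m.momentum = None         # None -> cumulative moving average
    model.train()                     # running stats update only in train mode
    for images, _ in target_loader:   # assumes (image, label) batches
        model(images)                 # forward passes alone refresh the stats
    model.eval()
```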
Adjusting for Dropout Variance in Batch Normalization and Weight Initialization
We show how to adjust for the variance introduced by dropout with corrections to weight initialization and Batch Normalization, yielding higher accuracy. Though dropout can preserve the expected input to a neuron between train and test, the variance of the input differs. We thus propose a new weight initialization by correcting for the influence of dropout rates and an arbitrary nonlinearity’s ...
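As a rough illustration of compensating for dropout-induced variance at initialization, the sketch below scales a He-style standard deviation by the square root of the keep probability; the correction in the cited work also accounts for the nonlinearity, so treat this as an assumption-laden approximation rather than the paper's exact formula.

```python
import math
import torch.nn as nn

def he_init_with_dropout(linear: nn.Linear, keep_prob: float) -> None:
    # Inverted dropout divides activations by keep_prob, inflating their
    # variance by roughly 1/keep_prob; shrinking the weight std by
    # sqrt(keep_prob) compensates for that inflation.
    fan_in = linear.in_features
    std = math.sqrt(2.0 / fan_in) * math.sqrt(keep_prob)
    nn.init.normal_(linear.weight, mean=0.0, std=std)
    nn.init.zeros_(linear.bias)

layer = nn.Linear(512, 256)
he_init_with_dropout(layer, keep_prob=0.8)
```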
Revisiting the Problem of Weight Initialization for Multi-Layer Perceptrons Trained with Back Propagation
One of the main reasons for the slow convergence and the suboptimal generalization results of MLP (Multilayer Perceptrons) based on gradient descent training is the lack of a proper initialization of the weights to be adjusted. Even sophisticated learning procedures are not able to compensate for bad initial values of weights, while good initial guess leads to fast convergence and or better gen...
L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks
Batch Normalization (BN) has been proven to be quite effective at accelerating and improving the training of deep neural networks (DNNs). However, BN brings additional computation, consumes more memory and generally slows down the training process by a large margin, which aggravates the training effort. Furthermore, the nonlinear square and root operations in BN also impede the low bit-width qu...
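To make the idea of an L1-based normalization concrete, here is a minimal sketch that divides by the per-channel mean absolute deviation instead of the standard deviation, avoiding square and square-root operations; the scaling constant, affine parameters, and training details of the cited work are omitted.

```python
import torch

def l1_batch_norm(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Normalize an (N, C, H, W) batch per channel using the mean absolute
    # deviation as the scale statistic. This is only a sketch of the core idea.
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    centered = x - mean
    mad = centered.abs().mean(dim=(0, 2, 3), keepdim=True)
    return centered / (mad + eps)

x = torch.randn(8, 16, 32, 32)
y = l1_batch_norm(x)
```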
Journal
Journal title: Lecture Notes in Computer Science
Year: 2022
ISSN: ['1611-3349', '0302-9743']
DOI: https://doi.org/10.1007/978-3-031-19803-8_13